perm filename OGDENI.PUB[2,TES] blob
sn#039426 filedate 1973-04-26 generic text, type T, neo UTF8
COMMENT ⊗ VALID 00008 PAGES
RECORD PAGE DESCRIPTION
00001 00001
00002 00002 .COMMENT OGDEN IMPLEMENTATION 2/25/73
00003 00003 .SEC STRUCTURES
00006 00004 .SEC EDITING CONSIDERATIONS
00007 00005 .SEC REPRESENTATION OF WORDS
00009 00006 .SEC LARGE FILES
00011 00007 .SEC DUMPING INTERNAL STRUCTURES
00013 00008 .STANDARD BACK
00014 ENDMK
⊗;
.COMMENT OGDEN IMPLEMENTATION 2/25/73 ;
.NOJUST
.BEGIN CENTER
.GROUP SKIP 10 ; TURN ON "{" ;
OGDEN IMPLEMENTATION NOTES
compiled by Larry Tesler
{DATE} ... Version 1
.END
.SKIP 8
.require "pubmac.nun[2,TES]" source_file
.standard front("I", "!.A")
.COUNT PAGE
.every footing(,{PAGE},)
.EVERY HEADING(,{DATE},OGDEN IMPLEMENTATION)
.SEC STRUCTURES
.SS TREE STRUCTURES
A "structure" is a tree (for now) in which each node refers to an atom or to
a molecules. For example, a paragraph
can occur in outline structure with sentences as direct subnodes
and again in galley structure with text lines as direct subnodes.
Although the structures imposed on a document may be quite different and
overlapping, each that we have thought of can be contorted into a special
case of a tree structure. In the most degenerate case (cross-reference
structure), each cross-reference is a tree of one leaf emanating from a
root. It is possible to generalize to directed graph structures, but
this seems an unnecessary complication.
Since there are many kinds of structures, the system will have to
have a notion of type. To avoid confusion with primitive data types,
we should find another term, say, "kinds".
.SS COORDINATE IDENTIFICATION
A node within a tree structure is best identified by a list of the names of the
atoms and molecules referenced by itself and by all of its direct ancestors.
This list provides the "coordinates" of the node within the structure.
A more complete identification of a node for global use is this coordinate list
prefixed by the identifier of the entire structure.
.SS COORDINATE TRANSFORMATION
A single atom or molecule may be referenced by many structures. It must
be possible to transform the coordinates of its referent node in one
structure to the coordinates of its referent node in any other structure.
One of the requirements of a newly invented structure is that the
transformations to all other structures be defined.
Some of these transformations may involve multiple stages.
It has not yet been proven that this is possible.
.SS GENERIC FUNCTIONS
Editing and formatting functions are "generic" with versions for
each kind of structure.
.SEC EDITING CONSIDERATIONS
.SS CREATION OF BOXES
From the galley position of a new box, OGDEN must compute for
each structure of the system the position at which new information must be
added because of the creation. Coordinate transformations are used
for this purpose. A structural update must also be considered for
components of the box, such as words and phrases that are being indexed.
.SS DELETION OF BOXES
When a box is deleted, all references to it are affected. The
generic "delete" functions for all affected structures are applied. The
coordinate transformation scheme had better take care of this correctly.
.SEC REPRESENTATION OF WORDS
.SS WORDS AS ATOMS
In the first version, we will represent each text word as a LISP atom.
This simplifies the implementation and gives us properties for free.
One atom will be needed for each distict word in the document.
The main disadvantage is that of the three words that LISPX
allocates for each atom, we only use at most two half-words:
PNAME and PROPERTIES. When we get to big documents, this may be a
disadvantage.
.SS WORDS AS STRINGS
An alternative is to make each word a string; but to hang a property
on it, we will have to use two hash tables. One takes the
small integer scramble of a string and yields a list of propertied
strings known to have that scramble; if the string is going to have
properties for the first time, it is added to this list. The
other hash table takes a propertied string and a property indicator
and yields a property value.
.SEC LARGE FILES
.SS LIST STRUCTURED FILES
Peter Deutsch has a package that allows cons pairs to be stored in
ascii on files, providing for car, cdr, and cons functions operating
on this data. Thus, when we run out of free storage in core, we
can start to use these files for infrequently accessed list structures.
.SS FRAGMENT FILES
Each fragment can be stored on a separate file. If we have space
problems but don't want to resort to Peter's package, we can at least
keep in core only the fragment currently being edited. If we
do use Peter's scheme, we can keep the fragment in the file and
only keep in core the part needed to maintain the display list.
.SS STRUCTURE FILES
Even if the fragment structure is on files, it will probably be
possible to keep the other structures in core at all times.
However, if this becomes impractical, they also can be put on
files.
.SEC DUMPING INTERNAL STRUCTURES
.SS DUMPER
Assuming that structures are stored internally, dumping them
on files between edits is a problem because internally
structures refer to subfragments by absolute addresses but
externally they must refer to relative locations. Thus, a
dumper will be needed to convert the absolute locations to
relative locations.
.SS LOADER
The loader does the inverse job of the dumper. To svoid having
to search a list N long to translate relative-N to an absolute
address, there can be an array which points to every 100th
element of a fragment. Thus, no more then 100 elements must be
searched.
.STANDARD BACK